Predicting Students’ Chance of Admission Using Beta Regression

Marcus Chery and Keiron Green

Introduction

Beta regression is a type of statistical analysis used for modeling dependent variables that are bounded on both sides, typically between 0 and 1. It is particularly useful for variables that represent proportions or percentages.

  • Definition: A statistical technique for modeling data that follows a beta distribution.
  • Key Characteristics:
    • Deals with continuous variables bounded between 0 and 1.
    • Suitable for rates or proportions.
  • Uses:
    • Ideal for modeling rates and proportions in finance, biology, and social sciences.
  • Importance:
    • Unlike ordinary linear regression, it keeps fitted values inside the (0, 1) bounds and accommodates the skewness and non-constant variance typical of rates and proportions.
  • Mechanics:
    • Models the effect of predictors on the mean of a beta-distributed response variable.
    • Uses a link function (commonly the logit), as logistic regression does, to relate the mean to the predictors.
    • Estimation typically done through maximum likelihood methods.
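
The mechanics listed above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in this study: it assumes the common mean-precision parameterization of the beta distribution (mean `mu`, precision `phi`), and the function names and toy values are hypothetical.

```python
import math

def logit(mu):
    """Logit link g(mu) = log(mu / (1 - mu)), defined for 0 < mu < 1."""
    return math.log(mu / (1.0 - mu))

def inv_logit(eta):
    """Inverse link: maps any real linear predictor back into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def beta_logpdf(y, mu, phi):
    """Log density of a beta variable with mean mu and precision phi.

    Assumes the shape parameters a = mu * phi and b = (1 - mu) * phi.
    """
    a, b = mu * phi, (1.0 - mu) * phi
    return (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))

def log_likelihood(beta, phi, X, y):
    """Sum of log densities; maximum likelihood picks beta and phi to maximize it."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        eta_i = sum(b_j * x_ij for b_j, x_ij in zip(beta, x_i))
        total += beta_logpdf(y_i, inv_logit(eta_i), phi)
    return total

# Toy values for illustration only (not taken from the admissions data).
X = [[1.0, -1.0], [1.0, 0.0], [1.0, 1.0]]  # first column is the intercept
y = [0.30, 0.55, 0.80]
print(log_likelihood([0.2, 1.0], 10.0, X, y))  # coefficients that track y
print(log_likelihood([0.0, 0.0], 10.0, X, y))  # flat model, mean 0.5 everywhere
```

The first call should print the larger value, since its coefficients follow the upward trend in `y`. A real fit would pass `log_likelihood` to a numerical optimizer rather than compare two hand-picked candidates.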

Image credit: Pabloparsil, own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=89335966

Methods

Assumptions

  • Link appropriateness: checked with a plot of deviance residuals against observation indices (at least for the logit link).

  • The response is a continuous random variable with values in the open interval (0, 1), such as a rate, a proportion, or a concentration or inequality index.

  • The dependent variable is beta-distributed.

  • Missing values and outliers should be addressed before the model is fitted.
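
A quick pre-fit check of the response against these assumptions might look like the sketch below. The function names are hypothetical, and the `squeeze` transformation, (y(n − 1) + 0.5) / n, is one commonly cited way to move exact 0s and 1s into the open interval. It is shown as an option only, not as a step this study performed.

```python
import math

def check_response(y):
    """Return (index, problem) pairs that would block a beta regression fit."""
    problems = []
    for i, value in enumerate(y):
        if value is None or (isinstance(value, float) and math.isnan(value)):
            problems.append((i, "missing"))
        elif not (0.0 < value < 1.0):
            problems.append((i, "outside (0, 1)"))
    return problems

def squeeze(y):
    """Pull exact 0s and 1s into (0, 1) using (y * (n - 1) + 0.5) / n."""
    n = len(y)
    return [(v * (n - 1) + 0.5) / n for v in y]

# Illustrative values only, chosen to trigger each kind of problem.
print(check_response([0.34, 0.97, 1.0, float("nan")]))
print(squeeze([0.0, 0.5, 1.0]))
```

Missing values have to be dropped or imputed before `squeeze` is applied, since it assumes every entry is a number in [0, 1].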

Equations

Beta regression model: $g(\mu_i) = x_i^\top \beta = \eta_i$, where $\beta = (\beta_1, \ldots, \beta_k)^\top$ is a $k \times 1$ vector of regression coefficients, $x_i = (x_{i1}, \ldots, x_{ik})^\top$ is the vector of $k$ regressors (independent variables or covariates), and $\eta_i$ is the linear predictor.
Link function: $g(\mu) = \log\left(\mu / (1 - \mu)\right)$.
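
Inverting the logit link shows how the linear predictor maps to a mean inside (0, 1) and why the coefficients can be read on the odds scale. The short derivation below writes $\eta_i$ for the linear predictor and otherwise uses only the symbols defined above:

```latex
\begin{aligned}
g(\mu_i) = \log\frac{\mu_i}{1-\mu_i} = \eta_i
\;&\Longrightarrow\;
\mu_i = \frac{e^{\eta_i}}{1+e^{\eta_i}} \in (0,1), \\
\frac{\mu_i}{1-\mu_i} &= \exp\!\left(x_i^\top \beta\right).
\end{aligned}
```

Under the logit link, a one-unit increase in $x_{ij}$, with the other regressors held fixed, multiplies the odds $\mu_i / (1 - \mu_i)$ by $e^{\beta_j}$.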

Data

Description:

  • This dataset was built to help students shortlist universities based on their profiles. The predicted output gives them a fair idea of their chances of admission to a particular university.

  • The dataset includes GRE score, TOEFL score, university rating, SOP (Statement of Purpose), LOR (Letter of Recommendation), CGPA, research experience, and chance of admission.

Variables - Attribute Information

The table below gives a brief description of each variable in the dataset.

Attribute Information

| Variable | Parameter | Range | Description |
|---|---|---|---|
| GRE Scores | gre_score | 290 - 340 (340 scale) | Quantifies a candidate’s performance on the Graduate Record Examination, with a maximum score of 340 |
| TOEFL Scores | toefl_score | 92 - 120 (120 scale) | Measures English language proficiency, scored out of a total of 120 points |
| University Rating | university_rating | 1 to 5, with 5 being the highest rating | Rates universities on a scale from 1 to 5, indicating their overall quality and reputation |
| Statement of Purpose (SOP) Strength | sop | 1 to 5, with 5 being the highest rating | Evaluates the strength and quality of a candidate’s SOP on a scale of 1 to 5 |
| Letter of Recommendation (LOR) Strength | lor | 1 to 5, with 5 being the highest rating | Evaluates the strength and quality of a candidate’s LOR on a scale of 1 to 5 |
| Undergraduate GPA | cgpa | 6.8 - 9.92 (10.0 scale) | Reflects a student’s academic performance in their undergraduate studies, scored on a 10-point scale |
| Research Experience | research | 0 or 1 | Indicates whether a candidate has research experience (1) or not (0) |
| Chance of Admit | chance_of_admit | 0.34 - 0.97 (0 to 1 scale) | Represents the likelihood of a student being admitted, expressed as a decimal between 0 and 1 |

Analysis and Results

Choosing Best Fit Model

Conclusion

  • The application of beta regression models has provided valuable insights into predicting student admission probabilities. 

  • The analysis highlights the significance of various factors, such as GRE scores, TOEFL scores, CGPA, university ratings, letters of recommendation, and research experience.

  • Candidate models were compared using AIC and pseudo R-squared; the combinations of these predictors with the lowest AIC and the highest pseudo R-squared were deemed most effective.

  • This study not only contributes to the academic understanding of factors influencing university admissions but also offers practical implications for educational institutions and policy-making.
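
The two comparison criteria named above can be sketched as follows. This is an assumption-laden illustration: AIC uses the standard definition 2k − 2 log L, and pseudo R-squared uses one common definition for beta regression, the squared correlation between the linear predictor and the logit-transformed response. The study may have computed them differently, and the function names and numbers are hypothetical.

```python
import math

def aic(log_lik, n_params):
    """Akaike information criterion, 2k - 2 log L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_lik

def pseudo_r_squared(eta, y):
    """Squared correlation between the linear predictor and logit(y)."""
    z = [math.log(v / (1.0 - v)) for v in y]
    n = len(z)
    mean_eta, mean_z = sum(eta) / n, sum(z) / n
    cov = sum((e - mean_eta) * (t - mean_z) for e, t in zip(eta, z))
    var_eta = sum((e - mean_eta) ** 2 for e in eta)
    var_z = sum((t - mean_z) ** 2 for t in z)
    return cov * cov / (var_eta * var_z)

# Hypothetical log-likelihoods and parameter counts for two candidate models.
print(aic(10.0, 3))  # smaller model
print(aic(12.0, 4))  # larger model with a slightly better fit
print(pseudo_r_squared([-1.0, 0.1, 1.2], [0.30, 0.55, 0.80]))
```

Here `n_params` should count every estimated quantity, including the precision parameter as well as the regression coefficients.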

References